Assigned Reading

1. Equational Reasoning and State
2. The derivation

For next week (19 July 2022) you should read both of the article You Could Have Invented Monads!, available by clicking here.

I also suggest reading the below writeup (by Zoltan Kocsis) about equational reasoning and State.

1 Equational Reasoning and `State`

In Lecture 5, we defined our own version of the State s monad, using the following type declaration:

type State s a = s -> (s, a)

Keep in mind that the standard libraries define State in a slightly different way!

We also defined two functions for working with State s, bindS and yield, defined as follows:

bindS :: State s a -> (a -> State s b) -> State s b
bindS f rest = \origState ->
  let (newState, retVal) = f origState in
  rest retVal newState

-- yield returns its argument, and leaves the state unchanged
yield :: a -> State s a
yield v = \origState -> (origState, v)

-- get returns the current state without changing it
get :: State s s
get = \origState -> (origState, origState)

-- put replaces the state with the given argument (i.e. updates the state)
put :: s -> State s ()
put newState = \origState -> (newState, ())

Recall that we used these operations (which, we now know, correspond to the monadic >>= and return) to implement the label and label' functions, which do exactly the same thing. The latter uses State, the former does not, but they implement exactly the same function.

But as far as functions go, even label itself is quite complicated. To really understand how State~/~bindS work and what they buy us, we have to start with a much simpler function. In this worked example, our goal is to understand the increment function, which had two different implementations in the lecture.

One of these implementations is very simple and straightforward, while the other relies on bindS, get and put.

increment :: State Integer Integer
increment = \lun -> (lun + 1, lun)

increment' :: State Integer Integer
increment' =
  bindS get              $ \lun ->
  bindS (put (lun + 1))  $ \()  ->
  yield lun

Again, these two implement exactly the same function. You can try this by calling them on the repl: you'll see that e.g. increment 5 and increment' 5 both return the pair (6,5). Both of them are functions of type Integer -> (Integer, Integer). What bindS gives us is a generally better, more convenient way of implementing functions of the type s -> (s, a). But how does it work?

Well, the great thing about Haskell is that equational reasoning is always valid. This means that, in principle, you can wrap your head around bindS by expanding out its definition, using equational reasoning. This is the derivation we'll do below. Now, it won't be easy: to follow along with the derivation, you should have a Haskell repl open, and test that increment' still type-checks, compiles and has same output after every change that we make.

In the end, once you understand what's happening, you should attempt to re-do this derivation completely by yourself, without referring to my write-up.

2 The derivation

We will start out with the definition of increment'' below, and refactor it step-by-step via equational reasoning. Our goal is to get rid of all the State-specific stuff, the bindS, get, put and yield functions, and arrive to something that looks like the definition of increment. This will prove that the two functions do exactly the same thing.

increment'' :: State Integer Integer
increment'' =
  bindS get              $ \lun ->
  bindS (put (lun + 1))  $ \()  ->
  yield lun

2.1 Step 1

Since bindS get $ \lun -> ... just means the same thing as bindS get (\lun -> ...), we can factor out that second argument into a separate function using a where clause.

increment'' :: State Integer Integer
increment'' =
  bindS get restOfIncrement where
    restOfIncrement = \lun ->
      bindS (put (lun + 1))  $ \()  ->
      yield lun

2.2 Step 2

From here on, we'll be simplifying the part outside of the where block. The goal is to get rid of bindS from there. This can be done via some equational reasoning: I will take the definition of bindS f rest = ..., given above, and replace bindS get restOfIncrement with the right hand side of that definition. Here's what we get:

increment'' :: State Integer Integer
increment'' =
  \origState ->
    let (newState, retVal) = get origState in
    restOfIncrement retVal newState where
      restOfIncrement = \lun ->
        bindS (put (lun + 1))  $ \()  ->
        yield lun

2.3 Step 3

We got rid of bindS (outside the where block), but another State-related function, get, is still there. We can rid ourselves of the get function too, by using equational reasoning again. The definiton given above is the same as get x = (x,x).

increment'' :: State Integer Integer
increment'' =
  \origState ->
    let (newState, retVal) = (origState, origState) in
    restOfIncrement retVal newState where
      restOfIncrement = \lun ->
        bindS (put (lun + 1))  $ \()  ->
        yield lun

2.4 Step 4

We will now rename the origState argument to w. We're allowed to do this, of course: the names of function arguments don't matter. Why do we do this? Well, on one hand shorter names are easier to read. On the other hand, it will avoid a name clash much later (you'll see where, but there's no need to worry about it for now).

increment'' :: State Integer Integer
increment'' =
  \w ->
    let (newState, retVal) = (w, w) in
    restOfIncrement retVal newState where
      restOfIncrement = \lun ->
        bindS (put (lun + 1))  $ \()  ->
        yield lun

2.5 Step 5

We can now get rid of that annoying let binding, simply by substituting the variables newState and retVal with w.

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where
      restOfIncrement = \lun ->
        bindS (put (lun + 1))  $ \()  ->
        yield lun

2.6 Step 6

The part outside the where clause is fairly simple now: we don't see how one could simplify it further. So, let's switch our attention to the restOfIncrement function inside the where clause. How could we simplify that? Before we simplify it, I'll replace restOfIncrement = \lun -> ... with restOfIncrement lun = ... just to make things slightly easier to read.

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where

      restOfIncrement lun =
        bindS (put (lun + 1))  $ \()  ->
        yield lun

2.7 Step 7

Now, we'll do the same thing we did in the first step: replace bindS f $ ... with bindS f (...) to make it clear what the two arguments actually are.

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where

      restOfIncrement lun =
        bindS (put (lun + 1)) (\() -> yield lun)

2.8 Step 8

Time to get rid of the bindS via equational reasoning! Again, just use the right-hand side of the definition given in the lecture (or above, refer back to the code!)

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where

      restOfIncrement lun =
        \origState ->
          let (newState, retVal) = put (lun + 1) origState in
          (\() -> yield lun) retVal newState

2.9 Step 9

We got rid of bindS, but we still have other state-specific functions, e.g. put. We get rid of put using some more equational reasoning, by using the definition put y x = (y, ()) (You should of course convince yourself that this is the same definition as the one given in the lecture!)

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where

      restOfIncrement lun =
        \origState ->
          let (newState, retVal) = (lun + 1, ()) in
          (\() -> yield lun) retVal newState

2.10 Step 10

As before, we can get rid of a let binding by substituting newState with lun + 1 and retVal with ().

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where

      restOfIncrement lun =
        \origState ->
          (\() -> yield lun) () (lun + 1)

2.11 Step 11

Let's get rid of that weird-looking (\() ->) function application too. What does it mean anyway?

Well, () is a type, with one constructor, confusingly also denoted (). So what is \() -> yield lun? It's a function that takes an argument of type (), pattern matches it against the value () (which will always succeed since the only value of type () is the value ()) and then returns yield lun. We could have written \_ -> yield lun or \q -> yield lun, these functions all do the same thing.

Now, what happens when you evaluate (\() -> yield lun) ()? Well, you're just applying the function \() -> yield lun to the value (). So it will return yield lun. Here's what we get:

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where

      restOfIncrement lun =
        \origState ->
          yield lun (lun + 1)

2.12 Step 12

Our last bit of equational reasoning: we use the definition yield v x = (x, v) (as usual, convince yourself that this is the same as the one given above!)

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where

      restOfIncrement lun =
        \origState ->
          (lun + 1, lun)

2.13 Step 13

It should be clear where this is going now. Let's just get rid of the \origState -> lambda.

increment'' :: State Integer Integer
increment'' =
  \w ->
    restOfIncrement w w where
      restOfIncrement lun origState = (lun + 1, lun)

2.14 Step 14

Remember, we only introduced restOfIncrement to make things easier to read! Now it's no longer useful, so we can get rid of it, along with the whole where block. How? We can just replace restOfIncrement w w by the right hand side of the definition of restOfIncrement, which is (w + 1, w).

increment'' :: State Integer Integer
increment'' = \w -> (w + 1, w)

And there we have it! We can now see that increment'' and increment are clearly the same function!

2.15 What did we learn?

We learned that there is no magic inside bindS, it really is just another way of writing ordinary functions of the form s -> (s,a). We can think of such functions as functions that manipulate a piece of state of type s, and ultimately return something of type a.

Previously (think back to label in Lecture 5) we had to write these functions by maintaining that piece of state manually, using a repetitive let (newState, rv) = ... syntax. But Haskell was powerful enough to let us capture this repetitive activity in a function, bindS. Thanks to bindS, we no longer have to explicitly deal with "threading" this state through our application. The code is easier to write, easier to read (for those readers who wrapped their heads around State), and we even eliminated a class of potential bugs (where we give the wrong state argument).

Of course, it's silly to use all this machinery for a trivially simple function like increment , but the payoff is already substantial for slightly more complicated functions, such as label' vs label of Lecture 5.